Functional Annotation of Genes Using Hierarchical Text Categorization

Authors

  • Svetlana Kiritchenko
  • Stan Matwin
  • Fazel Famili
Abstract

This paper addresses the task of functional annotation of genes from biomedical literature. We view this task as a hierarchical text categorization problem with Gene Ontology as a class hierarchy. We present a novel global hierarchical learning approach that takes into account the semantics of a class hierarchy. This algorithm with AdaBoost as the underlying learning procedure significantly outperforms the corresponding “flat” approach, i.e. the approach that does not consider any hierarchical information. In addition, we propose a novel hierarchical evaluation measure that gives credit to partially correct classification and discriminates errors by both distance and depth in a class hierarchy.
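
The abstract does not define the hierarchical evaluation measure, so the following is only a minimal sketch of one way such a measure could give partial credit. It assumes that both the true and the predicted label sets are extended with all of their ancestors before precision and recall are computed. The toy hierarchy `PARENTS` and the function names `with_ancestors` and `hierarchical_prf` are hypothetical and are not taken from the paper or from Gene Ontology.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy hierarchy mapping each category to its direct parents.
# It is illustrative only and is not taken from Gene Ontology.
PARENTS: Dict[str, Set[str]] = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B"},
    "E": {"B"},
}


def with_ancestors(labels: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the given labels together with all of their ancestors."""
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        node = stack.pop()
        if node not in closed:
            closed.add(node)
            stack.extend(parents.get(node, ()))
    return closed


def hierarchical_prf(
    true_labels: Iterable[str],
    predicted_labels: Iterable[str],
    parents: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 computed on ancestor-extended label sets."""
    true_ext = with_ancestors(true_labels, parents)
    pred_ext = with_ancestors(predicted_labels, parents)
    overlap = len(true_ext & pred_ext)
    precision = overlap / len(pred_ext) if pred_ext else 0.0
    recall = overlap / len(true_ext) if true_ext else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # True class is D. A sibling (E) shares more ancestors with D than a distant class (C).
    print(hierarchical_prf({"D"}, {"E"}, PARENTS))
    print(hierarchical_prf({"D"}, {"C"}, PARENTS))
```

Under this assumption, predicting the sibling `E` for a true class `D` scores higher than predicting the more distant class `C`, whereas a flat measure would score both predictions as zero. Errors made deeper in the hierarchy also share more ancestors with the true class, so they are penalized less than errors near the top.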

Related resources

Hierarchical Text Categorization and Its Application to Bioinformatics

In a hierarchical categorization problem, categories are partially ordered to form a hierarchy. In this dissertation, we explore two main aspects of hierarchical categorization: learning algorithms and performance evaluation. We introduce the notion of consistent hierarchical classification that makes classification results more comprehensible and easily interpretable for end-users. Among the p...


DIMACS at the TREC 2004 Genomics Track

DIMACS participated in the text categorization and ad hoc retrieval tasks of the TREC 2004 Genomics track. For the categorization task, we tackled the triage and annotation hierarchy subtasks. 1. TEXT CATEGORIZATION TASK The Mouse Genome Informatics (MGI) project of the Jackson Laboratory provides data on the genetics, genomics, and biology of the laboratory mouse. In particular, the Mouse Geno...


iDEP: An integrated web application for differential expression and pathway analysis

iDEP (integrated Differential Expression and Pathway analysis) is a web application that reads in gene expression data from DNA microarray or RNA-Seq and performs exploratory data analysis (EDA), differential expression, and pathway analysis. The key idea of iDEP is to make many powerful R/Bioconductor packages easily accessible by wrapping them under a graphical interface, alongside annotation...


Hierarchical Text Categorization as a Tool of Associating Genes with Gene Ontology Codes

A great deal of genomics information accumulated through years is available nowadays in on-line text repositories such as Medline. These resources are essential for biomedical researchers in their everyday activities on planning and performing experiments and verifying the results. However, these resources do not still provide adequate mechanisms for retrieving the requisite information. We pro...


Functional Annotation of Hierarchical Modularity

In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function-hierarchical modul...




Publication date: 2005